Abstract

This final report summarizes the PI's research in three
related areas: distributed system error detection, reliable
Internet software systems, and measurement-based admission
control for guaranteed quality of service networks.

• The PI implemented, deployed, and maintained a
probabilistic error checker (PEC) for the Domain
Name System (DNS). He then re-applied the PEC
concept to the design of a system to massively
replicate databases around the Internet.

• The PI devised a measurement-based call admission
control algorithm for real-time traffic over integrated
services packet networks. This admission control
algorithm doubles the performance of the competition's
algorithm for voice, video, and self-similar traffic.

• The PI, in collaboration with other agencies, applied
his research to the design of the Harvest hierarchical
Internet object cache. Besides reducing network traffic
from routine requests, it improves object availability
and isolates the network from accidentally looping
requests.


Objectives 

This project developed techniques to build and debug
robust, wide-area-network computer systems. The research
addressed how to

• uncover hidden implementation errors in deployed
software,

• isolate the effects of faulty nodes on wide-area-network
performance,

• use limited network bandwidth most efficiently.


The PI developed tools to diagnose and resolve performance
and design problems of heterogeneous, autonomously managed,
distributed systems, focusing attention on key Internet
components, including the Domain Name System (DNS) and the
WWW, gopher, and FTP services.

The research was based on the premise that distributed
systems fail in stereotypical ways: they deadlock, they
fail half-way through an operation, their networks partition
or exhibit asymmetric communication paths, their logic gets
caught in loops. The PI tested the hypothesis that the way
to make a robust, heterogeneous distributed system is to
program its elements so that the components themselves
exhibit this behavior right from the start. Applying his
ideas to software prototypes, the PI demonstrated that when
components of a distributed system occasionally mimic their
own failure during normal service, the system can be
designed to diagnose implementation and design mistakes
in other system components.


Summary  of  Research  Effort 

The PI implemented his DNS checker and probabilistic error
checker software and explored the probabilistic checker
technique in the Harvest wide-area-network data replicator.
He integrated the ideas behind the checker into the Harvest
Web cache. ISI is proposing to regain administrative
control of the Internet's root name servers and deploy
the PI's DNS checker software.

In a separate research thrust, he developed a network
conversation admission control algorithm for multi-media
network applications.

Probabilistic  Error  Checkers 

The first goal of this project was to create diagnostic
software that identifies broken components of distributed
services under normal service and occasionally stress-tests
particular components to explore their behavior under
duress. As an example of such software, the PI implemented
a diagnostic tool for name server traffic. This checker
program analyzed the interaction between a name server and
its possibly misbehaving clients.

The PI's probabilistic error checker for DNS demonstrates
that a simple module can identify misbehaving components
of a functioning distributed system by feigning death, and
can do so without breaking the system. Compared with the
full DNS checker, the probabilistic checker is significantly
smaller, simpler, and less memory demanding, yet still
capable of diagnosing broken components [2].
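
The feigned-death idea can be illustrated with a minimal sketch. It
assumes a toy request/reply server rather than the PI's actual DNS
checker: with a small probability the server deliberately drops a
query, then watches how quickly the same client retries. The class
name, the drop probability, and the retry-interval threshold are all
hypothetical choices made for illustration.

```python
import random


class ProbabilisticChecker:
    """Toy server wrapper that occasionally feigns death to test its clients."""

    def __init__(self, drop_probability=0.01, min_retry_interval=1.0, rng=None):
        self.drop_probability = drop_probability      # assumed value
        self.min_retry_interval = min_retry_interval  # assumed threshold (seconds)
        self.rng = rng or random.Random()
        self.pending = {}      # (client, query) -> time the query was dropped
        self.suspects = set()  # clients that retried too aggressively

    def handle(self, client, query, now):
        """Return True if the query is answered, False if deliberately dropped."""
        key = (client, query)
        if key in self.pending:
            # Retry of a query we dropped earlier: check the client's backoff.
            if now - self.pending.pop(key) < self.min_retry_interval:
                self.suspects.add(client)
            return True  # always answer the retry so service is not broken
        if self.rng.random() < self.drop_probability:
            self.pending[key] = now
            return False  # feign death for this one query
        return True


# Force a drop on every first query to make the behavior visible.
checker = ProbabilisticChecker(drop_probability=1.0)
checker.handle("good-client", "example.us", now=0.0)   # dropped
checker.handle("good-client", "example.us", now=2.0)   # polite retry, answered
checker.handle("bad-client", "example.us", now=0.0)    # dropped
checker.handle("bad-client", "example.us", now=0.01)   # immediate retry, answered
print(checker.suspects)
```

Because every retry is answered, a well-behaved client sees only one
lost packet, while a client with a broken retransmission timer is
flagged. Only the dropped queries need to be remembered, which is in
the spirit of the small memory footprint described above.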

This software is deployed on a secondary name server of the
".us" domain and can be queried from
http://excalibur.usc.edu/research/checker. The software and
instruction manuals for the DNS checker are available from

ftp://catarina.usc.edu/pub/danzig/home.html


Hierarchical  Object  Cache 


The explosion of interest in the Internet information
browser “Mosaic” has taxed the Internet's WWW, gopher, and
FTP servers. With joint funding from Hughes Information
Systems (via a NASA subcontract) and directly from ARPA,
the PI built the Harvest Object Cache, which is
architecturally similar to DNS but designed to avoid the
flaws uncovered in DNS [1]. The cache has led to an
international collaboration of several hundred users
(http://www.nlanr.net/Cache).

When a referenced object causes a cache miss, the cache
estimates its network distance to the object's home node
and checks to see if the object would hit the cache's
immediate parents and siblings. If the object's home node
is closer than any sibling or parent cache for which the
object is a hit, the cache fetches the object directly.
Otherwise, if the object is a miss in all siblings and
parents, then the cache fetches the object through the
closest parent cache. By closest cache, we mean the parent
cache with the shortest round-trip time. An example cache
topology is shown in Figure 1.
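
The miss-resolution rule above can be sketched as follows. This is an
illustration under stated assumptions, not the Harvest source code:
the `Neighbor` type and `choose_source` function are hypothetical
names, distance is taken to be round-trip time in milliseconds, and
the case the text leaves implicit (a neighbor holds the object and is
at least as close as the home node) is assumed to resolve to the
nearest neighbor that has the object.

```python
from typing import List, NamedTuple


class Neighbor(NamedTuple):
    name: str
    is_parent: bool   # True for a parent cache, False for a sibling
    rtt_ms: float     # measured round-trip time to this cache
    has_object: bool  # True if the object would hit in this cache


def choose_source(home_rtt_ms: float, neighbors: List[Neighbor]) -> str:
    """Pick where to fetch an object from after a local cache miss."""
    hits = [n for n in neighbors if n.has_object]
    if hits:
        nearest_hit = min(hits, key=lambda n: n.rtt_ms)
        if home_rtt_ms < nearest_hit.rtt_ms:
            return "home"        # home node is closer than every neighbor hit
        return nearest_hit.name  # assumption: otherwise use the nearest hit
    parents = [n for n in neighbors if n.is_parent]
    if parents:
        # Miss in all siblings and parents: go through the closest parent.
        return min(parents, key=lambda n: n.rtt_ms).name
    return "home"                # assumption: no parents are configured


neighbors = [
    Neighbor("parent-a", True, 40.0, False),
    Neighbor("parent-b", True, 25.0, False),
    Neighbor("sibling-c", False, 10.0, True),
]
print(choose_source(30.0, neighbors))  # neighbor hit is closer than home
print(choose_source(5.0, neighbors))   # home node is closest
print(choose_source(30.0, [n._replace(has_object=False) for n in neighbors]))
```

The three calls exercise the three branches: a sibling hit that beats
the home node, a home node that beats every hit, and a miss everywhere
that falls back to the parent with the shortest round-trip time.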


Figure 1: Hierarchical object cache for the Internet.


Real-Time, Multi-media Networks

The PI developed a measurement-based conversation admission
control algorithm for real-time, multi-media networks
[3, 4]. The algorithm applies to integrated packet networks,
whether IP or ATM. To provide real-time, multi-media
services, the network must assign and reserve resources to
users. While current IP and ATM networks do not yet have
this infrastructure in place, they eventually will.

Existing admission control algorithms require that
network users accurately specify their quality of service
(QOS) requirements. The admission control algorithm
then translates this QOS specification into bandwidth,
buffer space, and switching priority reservations at each
network switch and link.

The PI is currently developing a measurement-based
admission control algorithm. The user still states his
quality of service requirements, but the network measures
its total resource usage and attempts to over-book the
reservations to increase network utilization. The admission
control algorithm works with two classes of service:
guaranteed service and predictive service. Guaranteed and
predictive service, as the names imply, are comparable to
hard and soft real-time systems. Guaranteed service
reserves more resources than predictive service, but the
network always meets the requested QOS.

The goal of our admission control research, like our
cache and replicator research, is to use network bandwidth
more efficiently.
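
The difference between admitting on declared requirements and
admitting on measured usage can be sketched with a small example.
This is a simplified illustration, not the algorithm published in
[3, 4]: the two function names, the single-link model, and the 0.9
utilization target are assumptions made here for clarity.

```python
def admit_by_declaration(capacity, declared_rates, new_rate):
    """Parameter-based check: the sum of all declared rates must fit the link."""
    return sum(declared_rates) + new_rate <= capacity


def admit_by_measurement(capacity, measured_load, new_rate,
                         target_utilization=0.9):
    """Measurement-based check: use the observed load for existing flows and
    the declared rate only for the newcomer, leaving some headroom."""
    return measured_load + new_rate <= target_utilization * capacity


capacity = 100.0                # link capacity, arbitrary units
declared = [30.0, 30.0, 30.0]   # what the admitted flows said they need
measured_load = 45.0            # what those flows are actually using
new_rate = 20.0                 # declared rate of the flow asking to join

print(admit_by_declaration(capacity, declared, new_rate))
print(admit_by_measurement(capacity, measured_load, new_rate))
```

With these numbers the declared reservations already add up to 90
units, so a declaration-based check refuses the 20-unit flow, while
the measured load of 45 units leaves room for it. Over-booking in this
way raises utilization at the cost of occasionally violating a soft
guarantee if the existing flows later use their full declared rates.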


Personnel 

The following students were funded by this AFOSR project:

• Steve Miller, who wrote the DNS checker.

• Charles Chan, who wrote the probabilistic error
checker for DNS and who created the small-memory
version of the checker, is now employed at Motorola.

• Katia Obraczka, who recently completed her Ph.D.
dissertation on “Massively Replicating Services in
Wide Area Internetworks”, now works at ISI.

• Erhyuan Tsai, who debugged our mirrord replication
tool using a probabilistic error checker, now works
at Netscape.

• Anawat Chankhunthod and Chuck Neerdaels worked
on the hierarchical object cache. Chuck now works
for Netscape. Anawat is still a Ph.D. student.

• Sugih Jamin passed his Ph.D. thesis defense on
“A Measurement-based Admission Control Algorithm
for Integrated Services Packet Networks”.
He won the best student paper prize at SIGCOMM
95 and is now interviewing for faculty and research
lab positions.


[3] Sugih Jamin, Peter B. Danzig, Scott Shenker, and
Lixia Zhang. A measurement-based admission control
algorithm for integrated services packet networks.
To appear: ACM Transactions on Networking, 1996.

[4] Sugih Jamin, Peter B. Danzig, Scott Shenker, and
Lixia Zhang. A measurement-based admission control
algorithm for integrated services packet networks.
SIGCOMM 95, August 1995.


Awards 

• received the best student paper award, 1995 ACM SIGCOMM

• served on the 1993, 1995, & 1996 ACM SIGCOMM
program committees

• served on the 1996 IEEE INFOCOM program
committee

• served on an NSF Networking Research Panel,
December 1994 & March 1996

• served on the 1993 & 1996 ACM SIGMETRICS
program committees

• received the NSF NYI award, 1994

• served as associate editor, Journal of Internetworking